ABSTRACT

In very crowded fields, the modulation of the background by the sea of 
unresolved faint sources induces centroid shifts. The errors increase with the 
number of sources per beam. Even the most optimistic simulations of imaging 
data show that position errors can become severe (on the order of the beam size) 
at flux levels at which images contain 1/50 to 1/15 sources per beam, depending 
on the slope of the number-flux relation $d\log N/d\log S$. These problems are
expected to be significant for recent observations of faint submillimeter sources 
and may be the reason that some sources appear to lack optical counterparts. 



Subject headings: astrometry — galaxies: photometry — methods: 
observational — submillimeter — surveys 



1. Introduction 

Fainter is (usually) better when it comes to star and galaxy counts. However, there are 
fundamental limits to faint imaging from confusion which cannot be overcome by increasing 
exposure times alone. The sea of unresolved sources fainter than the detection limit creates 
a noise in the sky, which does not improve with more data. 

Many of the issues associated with this confusion noise have been discussed before (eg, 
Scheuer 1957; Condon 1974; Franceschini 1982; Hacking & Houck 1987; Barcons 1992).
However, the large number of present-day observations that are or soon will be pushing
this confusion limit calls for a new discussion. In particular, in recent years there has
been a concerted effort to produce very deep, multi-wavelength studies of blank sky in 
order to identify extragalactic sources as comprehensively as possible. These studies have 
been very successful, identifying populations of radio-, submillimeter-, infrared-, visual- 
and x-ray-bright galaxies and associating them with their counterparts in other bands
(Djorgovski et al 1995; Williams et al 1996; Hogg et al 1996; Rowan-Robinson et al 1997;
Richards et al 1998; Hughes et al 1998; Barger et al 1998; Eales et al 1999; Aussel et al
1999; Elbaz et al 1999; Gardner et al 2000; Brandt et al in preparation; Dickinson et al 
in preparation). Some of the faintest sources in some of the most crowded fields (in the 
sense of number of sources per resolution element) have not shown clearly distinguished 
counterparts at other wavelengths (Hughes et al 1998; Smail et al 1998; Barger et al 1999b).
This raises the question "could confusion be playing a role?" 

This paper is a first attempt at characterizing position shifts due to confusion in 
astronomical images. Simulated images of crowded fields are presented, made in the most 
optimistic way: no photon noise, a perfectly understood gaussian point spread function or 
beam shape, pointlike sources, a power-law number-flux relation of known slope, and no 
angular clustering. Even with these optimistic inputs, the resulting images show that it 
is impossible to accurately measure positions and fluxes of sources that are more than an 
order of magnitude brighter than the flux level corresponding to one source per beam (a 
beam being one resolution element in the image). Recent work making use of the limit "one 
source per beam" (eg, Blain et al 1998) is therefore overly optimistic. 

The standard rule-of-thumb is that confusion becomes important at 1/30 of a source 
per beam. It is possible to get information from the statistics of the background noise fainter 
than the level of 1/30 source per beam (eg, Scheuer 1957; Condon 1974), but in terms of 
identifying and measuring individual sources, 1/30 is regarded as the limit. This paper tests 
the rule-of-thumb for confusion-induced astrometry errors, which are particularly important 
for deep, multi-wavelength studies, in which counterpart identification across multiple data
sets is important. Astrometric shifts due to confusion have been predicted and observed 
in the context of microlensing data (Goldberg 1998; Goldberg & Wozniak 1998) and are 
expected to limit future stellar astrometry experiments (Yu et al 1993; Rajagopal & Allen
1999). 

For the purposes of this paper a "beam" is taken to be the solid angle of the 1 $\sigma$-radius
circle of the gaussian point spread function, or $\Omega_{\rm beam} = \pi\,\sigma^2$. Note that for a gaussian,
$\sigma \approx \theta_{\rm FWHM}/2.35$ where $\theta_{\rm FWHM}$ is the angular full width at half-maximum of the point spread
function. The number of sources per beam s/b at a given flux level $S$ is the integrated
number of sources $N(>S)$ in an image brighter than flux $S$ divided by the number of
beams in the solid angle of the image, $\Omega_{\rm image}/\Omega_{\rm beam}$.
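
As a minimal sketch of these definitions, the beam solid angle and the number of sources per beam could be computed as follows. The function names are illustrative and not from the paper, and the conversion factor 2.35 is the rounded value quoted above.

```python
import math

FWHM_TO_SIGMA = 2.35  # rounded gaussian conversion factor used in the text


def beam_solid_angle(theta_fwhm):
    """Solid angle of the 1-sigma-radius circle, Omega_beam = pi * sigma**2."""
    sigma = theta_fwhm / FWHM_TO_SIGMA
    return math.pi * sigma ** 2


def sources_per_beam(n_brighter, omega_image, theta_fwhm):
    """s/b = N(>S) / (Omega_image / Omega_beam), all in consistent units."""
    n_beams = omega_image / beam_solid_angle(theta_fwhm)
    return n_brighter / n_beams


# Example: a 512 x 512 pixel image with a 4 pixel FWHM (units of pixels).
n_beams = 512 * 512 / beam_solid_angle(4.0)
```

With these inputs `n_beams` comes out close to 2.88e4, so a catalog of about 960 sources in such an image would sit near s/b = 1/30.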

2. Method and analysis

Four 512 × 512 pixel artificial images were made with sources randomly distributed over
the image (and in fact over an area larger than the image so that image edges are realistic).
The sources were randomly assigned fluxes in power-law distributions $d\log N/d\log S = -\beta$
where $\beta = 1.50$, 1.00, 0.75, and 0.50, one value of $\beta$ per image. Positions were randomly
assigned and were not quantized onto the pixel grid. The point-spread function was chosen
to be perfectly gaussian with $\theta_{\rm FWHM} = 4$ pixels, so that it is well sampled. Each image
contains $\Omega_{\rm image}/\Omega_{\rm beam} = 2.88 \times 10^4$ beams. The artificial source catalogs were truncated at
s/b = 3 sources per beam, ie, at much higher angular density than the s/b ~ 1/30 sources
per beam rule-of-thumb. The four artificial images are shown in Figure 1.
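
A rough sketch of how such an image could be generated is given below. It is an illustration under stated assumptions, not the code used for this paper: the function name `simulate_image`, the padding width `pad`, the stamp size, the flux normalization `s_min`, and the random seed are all hypothetical choices, and fluxes are drawn by inverse-transform sampling from integral counts $N(>S) \propto S^{-\beta}$.

```python
import numpy as np


def simulate_image(beta, npix=512, fwhm=4.0, max_per_beam=3.0,
                   pad=16, s_min=1.0, seed=0):
    """Noise-free image of point sources with integral counts N(>S) ~ S**-beta.

    pad, s_min and seed are illustrative assumptions, not values from the text.
    Returns (image, x, y, flux); pixel i is centred on coordinate i.
    """
    rng = np.random.default_rng(seed)
    sigma = fwhm / 2.35
    beam = np.pi * sigma ** 2                 # 1-sigma-radius beam, in pixels
    side = npix + 2 * pad                     # sources spill past the edges
    n_src = int(round(max_per_beam * side ** 2 / beam))

    # Continuous (unquantized) positions over the padded area.
    x = rng.uniform(-pad, npix + pad, n_src)
    y = rng.uniform(-pad, npix + pad, n_src)
    # Inverse-transform sampling: P(flux > S) = (S / s_min)**-beta for S >= s_min.
    flux = s_min * (1.0 - rng.random(n_src)) ** (-1.0 / beta)

    image = np.zeros((npix, npix))
    half = int(np.ceil(5.0 * sigma))          # stamp half-width of about 5 sigma
    norm = 1.0 / (2.0 * np.pi * sigma ** 2)   # peak value of a unit-flux gaussian
    for xi, yi, fi in zip(x, y, flux):
        x0 = max(int(np.floor(xi)) - half, 0)
        x1 = min(int(np.floor(xi)) + half + 1, npix)
        y0 = max(int(np.floor(yi)) - half, 0)
        y1 = min(int(np.floor(yi)) + half + 1, npix)
        if x0 >= x1 or y0 >= y1:
            continue                          # stamp lies wholly off the image
        gx = np.exp(-0.5 * ((np.arange(x0, x1) - xi) / sigma) ** 2)
        gy = np.exp(-0.5 * ((np.arange(y0, y1) - yi) / sigma) ** 2)
        image[y0:y1, x0:x1] += fi * norm * np.outer(gy, gx)
    return image, x, y, flux
```

Because the catalog is cut at a fixed number of sources per beam, `s_min` only sets the flux scale; the relative crowding at a given s/b is unchanged by it.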

No noise was added to the images; the sources are not extended beyond the gaussian 
beam shape; and the sources were not given any angular clustering. The artificial images 
represent high optimism. 

The background levels in the four images were fit by sigma-clipping (ie, iteratively
removing outlier pixels) at 3-sigma and they were subtracted from the images. Although 
the input background levels were zero, the fit levels are above zero for all of the images 
because of the integrated flux from all the unresolved sources. In almost all real observations 
but especially at wavelengths longer than the near-ultraviolet (ground-based) or visual 
(space-based), images contain large DC levels from sky emission or telescope thermal 
emission so the true background (or, more accurately, foreground) is unknown and must be 
fit by a procedure similar to the sigma-clipping used here. For $\beta > 1.0$, the background
levels do not converge, in the sense that the background light is dominated by the faintest 
sources, and in these artificial images the background level is just set by the depth to which 
the artificial source catalogs are simulated. However, experiments involving changing this 
depth show that the confusion noise or level of background modulation in the artificial 
images has converged. 

Note that the sigma-clipping background estimation technique is equivalent to 
(although less subjective than) estimating the background from regions of the image that 
appear empty or blank. 
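
The background fit can be sketched as follows. This is one plausible reading of "sigma-clipping at 3-sigma": the choice of the clipped mean as the estimator, the iteration cap, and the function name are assumptions made here for illustration.

```python
import numpy as np


def sigma_clipped_background(image, nsigma=3.0, max_iter=20):
    """Iteratively drop pixels more than nsigma standard deviations from the
    mean, then return the mean of the surviving pixels."""
    pixels = np.asarray(image, dtype=float).ravel()
    for _ in range(max_iter):
        mean, std = pixels.mean(), pixels.std()
        keep = np.abs(pixels - mean) <= nsigma * std
        if keep.all():
            break                      # converged: nothing left to clip
        pixels = pixels[keep]
    return float(pixels.mean())
```

Applied to a confused image, the returned level sits above zero because the clipping removes bright peaks but not the smooth contribution of the unresolved sources.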

At each location of a source in the artificial source catalog, a centroid is found in the 
artificial image in a box of side length $2\,\theta_{\rm FWHM}$ centered on the artificial source location in
the catalog. These centroids are what will be referred to as the "measured" positions. 
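
A minimal sketch of such a centroid measurement is shown below, assuming a simple flux-weighted first moment over the box (the text does not specify the centroiding algorithm beyond the box size, so the weighting and the function name are assumptions).

```python
import numpy as np


def box_centroid(image, x0, y0, fwhm=4.0):
    """Flux-weighted centroid in a square box of side 2*fwhm centred on (x0, y0).

    Pixel (row j, column i) is taken to sit at coordinates (x=i, y=j).
    Returns (nan, nan) if the box is empty or its total flux is not positive.
    """
    ny, nx = image.shape
    xlo = max(int(np.ceil(x0 - fwhm)), 0)
    xhi = min(int(np.floor(x0 + fwhm)), nx - 1)
    ylo = max(int(np.ceil(y0 - fwhm)), 0)
    yhi = min(int(np.floor(y0 + fwhm)), ny - 1)
    if xlo > xhi or ylo > yhi:
        return float("nan"), float("nan")
    sub = image[ylo:yhi + 1, xlo:xhi + 1]
    total = sub.sum()
    if total <= 0:
        return float("nan"), float("nan")
    ys, xs = np.mgrid[ylo:yhi + 1, xlo:xhi + 1]
    return float((xs * sub).sum() / total), float((ys * sub).sum() / total)
```

On a background-subtracted image some pixels are negative, which is why the sketch guards against a non-positive total before dividing.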

The catalog was trimmed to only "isolated sources". The source positions measured
by centroiding are in general shifted from the true positions in the artificial source catalog;
sometimes this is because there is a brighter source nearby that is blending with the fainter
source of interest, sometimes this is just because the source is projected onto the rolling
sea of unresolved, fainter sources. Only the latter effect is properly an effect of confusion
noise. For this reason, sources with measured positions closer than $2\,\theta_{\rm FWHM}$ from a brighter
source were dropped from the analysis that follows. This restriction to "isolated" sources
(excluding faint sources found to be near brighter sources) removes many of the largest
deviations of measured from true positions, especially for the artificial images with $\beta < 1$,
where bright sources dominate.
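
The isolation cut could be implemented along the following lines. This is a brute-force sketch with a hypothetical function name; it assumes that the separation is measured between measured positions for both the source and its brighter neighbor, which the text does not state explicitly.

```python
import numpy as np


def isolated_mask(x, y, flux, fwhm=4.0):
    """True for sources with no brighter source closer than 2*fwhm.

    x, y are measured positions; the pairwise search is O(N**2), so a spatial
    index would be preferable for catalogs of many thousands of sources.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    flux = np.asarray(flux, dtype=float)
    min_sep2 = (2.0 * fwhm) ** 2
    keep = np.ones(len(x), dtype=bool)
    for i in range(len(x)):
        d2 = (x - x[i]) ** 2 + (y - y[i]) ** 2
        keep[i] = not np.any((flux > flux[i]) & (d2 < min_sep2))
    return keep
```

Note that the brighter member of a close pair is kept; only the fainter one is dropped.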

The centroid errors for the isolated sources, in units of the beam's half-width at half
maximum (HWHM), are shown in Figure 2 as a function of the source density, or number
of sources per beam s/b. Really these are measured as a function of flux, but since source
density increases with decreasing flux limit, the quantities are interchangeable; each point
is plotted at the source density s/b which would be found in a source catalog made down
to a limit equal to that source's flux. A running median and a running 90 percent level
are also shown. The results are dramatic. For $\beta = 1.5$, which is the typical count slope
for submillimeter sources or nearby stars, measured source positions are occasionally (10
percent of the time) displaced from their true source positions by a significant fraction
of the half-power point of the beam at s/b = 1/40; such displacements are common at
s/b = 1/20. These problems are alleviated as the counts become less steep; at $\beta \approx 0.75$ (the
typical count-slope for faint visual and infrared sources), the worst 10 percent of positions
are displaced by the HWHM at s/b = 1/17. Recall that these numbers (and Figure 2) are
computed only for isolated sources, as defined above.
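
The running median and running 90 percent level can be sketched as a sliding window over the sources sorted by source density. The window length and the function name are assumptions for illustration; the text does not say how the running statistics were computed.

```python
import numpy as np


def running_percentiles(density, error, window=101, percentiles=(50.0, 90.0)):
    """Sliding-window percentiles of `error` along sorted `density`.

    Returns (centers, levels): the median density in each window and an
    array of shape (n_windows, len(percentiles)) of error percentiles.
    """
    density = np.asarray(density, dtype=float)
    error = np.asarray(error, dtype=float)
    n = len(density) - window + 1
    if n < 1:
        raise ValueError("need at least `window` sources")
    order = np.argsort(density)
    d, e = density[order], error[order]
    centers = np.array([np.median(d[i:i + window]) for i in range(n)])
    levels = np.array([np.percentile(e[i:i + window], percentiles)
                       for i in range(n)])
    return centers, levels
```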

Crowding-induced centroid shifts have been observed in some microlensing events: as
a faint lensed star becomes brighter, its apparent position shifts towards its true position 
because confusion noise becomes less important (Goldberg 1998; Goldberg & Wozniak 
1998). 

For each measured source, the gaussian beam shape is fit to the peak in a square box of 
side length $2\,\theta_{\rm FWHM}$ centered on the measured centroid. This provides a measured flux. The
measured and true fluxes are compared in Figure 3. Again the results are shown in terms
of source density rather than flux level. The flux errors become bad at roughly the same 
source per beam levels as the position errors. Note also that at the faint end there is a bias 
in the median flux error, caused by the subtraction of a finite background from the images. 
This bias will exist in all observations that are analyzed after background subtraction (all 
visual and near-infrared images) and all observations made with chopping (many infrared 
and submillimeter images). 
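
One way to carry out such a fit is sketched below. It assumes that only the amplitude of a gaussian of known width and position is fit, which makes the least-squares solution linear; the function name and this choice of free parameters are assumptions, not a description of the fitting code actually used.

```python
import numpy as np


def fit_gaussian_flux(image, xc, yc, fwhm=4.0):
    """Least-squares flux of a fixed-width gaussian centred on (xc, yc).

    Only the amplitude is fit (an assumption); the box has side 2*fwhm and
    pixel (row j, column i) sits at coordinates (x=i, y=j).
    """
    sigma = fwhm / 2.35
    ny, nx = image.shape
    xlo = max(int(np.ceil(xc - fwhm)), 0)
    xhi = min(int(np.floor(xc + fwhm)), nx - 1)
    ylo = max(int(np.ceil(yc - fwhm)), 0)
    yhi = min(int(np.floor(yc + fwhm)), ny - 1)
    if xlo > xhi or ylo > yhi:
        return float("nan")
    ys, xs = np.mgrid[ylo:yhi + 1, xlo:xhi + 1]
    model = np.exp(-0.5 * ((xs - xc) ** 2 + (ys - yc) ** 2) / sigma ** 2)
    data = image[ylo:yhi + 1, xlo:xhi + 1]
    # Linear least squares for the peak amplitude of a unit-peak model.
    amplitude = (model * data).sum() / (model ** 2).sum()
    # Convert peak amplitude to total flux of a 2-d gaussian.
    return float(amplitude * 2.0 * np.pi * sigma ** 2)
```

Because the model has no free background term, any residual background under the source goes straight into the fitted flux, which is the kind of bias discussed above.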

3. Discussion of assumptions

As has been emphasized, this study makes use of very optimistic assumptions about 
the properties of the imaging data. An experiment was performed to test one of these 
assumptions: the perfectly gaussian beam. An image was made by "beam switching" with
a nod of $3\,\theta_{\rm FWHM}$ so that the beam consists of a central positive gaussian flanked by two
negative gaussians of half the power but the same FWHM separated by angles of $\pm 3\,\theta_{\rm FWHM}$
on either side. These parameters were chosen to roughly match typical submillimeter
observing strategies (eg, Eales et al 1999). In these artificial beam-switched images, at
constant source density, the median positional errors are typically ~ 30 percent worse
at $\beta = 1.5$ and a factor of ~ 3 worse at $\beta = 0.75$, relative to the images made with the
single gaussian beam. A pure gaussian beam may therefore be an unrealistically optimistic
assumption, although it is close to correct for atmosphere-limited visual and near-infrared
observations.
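
The beam-switched beam described above can be written down directly. In this sketch the nod direction is taken along the x axis and the kernel is built on a pixel grid with unit central peak; both choices, and the function name, are assumptions for illustration.

```python
import numpy as np


def chopped_beam(fwhm=4.0, nod=3.0, half_size=None):
    """Central positive gaussian flanked by two negative gaussians of half the
    amplitude, offset by +/- nod*fwhm along x (the nod axis is an assumption)."""
    sigma = fwhm / 2.35
    throw = nod * fwhm
    if half_size is None:
        half_size = int(np.ceil(throw + 5.0 * sigma))
    y, x = np.mgrid[-half_size:half_size + 1, -half_size:half_size + 1]

    def lobe(x0):
        return np.exp(-0.5 * ((x - x0) ** 2 + y ** 2) / sigma ** 2)

    return lobe(0.0) - 0.5 * lobe(throw) - 0.5 * lobe(-throw)
```

The kernel integrates to nearly zero, so an image built with it has no net background; every source contributes negative sidelobes that add to the modulation seen by its neighbors.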

The assumption of point-like or non-extended sources is overly optimistic in ground- 
based and space-based optical imaging, where recent data is not, by and large, confusion 
limited. It is probably not a problem for recent submillimeter observations, which have a 
beam with $\theta_{\rm FWHM} \sim 15$ arcsec. Unfortunately, a proper treatment of the effects of finite
source sizes involves modeling distributions of sizes, radial profiles and shapes, all as a 
function of flux; this is outside the scope of this work. 

The assumption that sources are unclustered is probably optimistic for virtually all 
deep imaging observations. Clustering becomes important whenever the angular correlation 
length is larger than or on the order of the beam size (eg, Barcons et al 1992), which is true 
for virtually all optical and near-infrared imaging. This condition is probably also met for 
the submillimeter sources, although at present the numbers of sources are too small for 
a direct measurement. Again, a full treatment requires parameterization of the angular 
clustering and its dependence on flux. 

The assumption of a power-law number-flux relation must be incorrect in detail, in the
sense that the integrated flux from sources in a power law diverges either at the faint or
bright end (or both). In particular, the $\beta = 1.5$ models presented here diverge in terms of
total flux (although not in terms of fluctuations) at the faint end. Experiments of varying
the depth to which the artificial source catalogs go have not shown significant changes in
the error distributions. The errors are not dominated by the very faintest sources; they are
dominated by the sources with fluxes which fall between the flux of the source in question
and the level at which there is s/b ~ 1 source per beam. It is the number-flux relation in
this region only which is important to the confusion noise.
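
The divergence can be made explicit with a short worked example, assuming integral counts of the pure power-law form $N(>S) = k\,S^{-\beta}$ (the normalization $k$ and the flux limits $S_1 < S_2$ are introduced here only for illustration). The integrated flux $F$ and the second moment $M_2$, which controls the fluctuations, from sources between $S_1$ and $S_2$ are

$$
F = \int_{S_1}^{S_2} S\,\left|\frac{dN}{dS}\right|\,dS
  = \frac{\beta\,k}{1-\beta}\left(S_2^{1-\beta} - S_1^{1-\beta}\right) \quad (\beta \neq 1),
$$

$$
M_2 = \int_{S_1}^{S_2} S^2\,\left|\frac{dN}{dS}\right|\,dS
  = \frac{\beta\,k}{2-\beta}\left(S_2^{2-\beta} - S_1^{2-\beta}\right) \quad (\beta \neq 2).
$$

For $\beta = 1.5$, $F$ grows without bound as $S_1 \to 0$ while $M_2$ stays finite, which is the behavior described above; for $\beta < 1$ the total flux instead diverges at the bright end as $S_2 \to \infty$.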

4. Detected sources

Perhaps the single most unrealistic assumption is that the true source positions are
known in advance; ie, the measurement of centroid shifts, above, was performed by taking 
centroids in the vicinities of the true source positions. This assumption is extremely 
optimistic, because in a real astronomical project, sources are usually detected ab initio, 
with no prior knowledge of their positions. 

To test the influence of this optimistic assumption, sources were detected in the
simulated images with DAOPHOT (Stetson 1987) and matched, after detection, to the
true source positions. This matching is not unique; since position shifts are large, there are
many faint "detected" sources which could potentially be matched with each of the faint
"true" source positions, and vice versa. To reduce this ambiguity, the detected source flux
can be compared with the true flux. Figure 4 shows the positional errors between detected
and true sources as a function of detected source density. 

The detected sources were matched to the true source positions by taking, for each 
detected source, the closest true source with its flux between 0.67 and 1.33 of the detected 
source flux. This choice is admittedly arbitrary, but it represents a cut equivalent to 
something like 3-sigma. At bright limits, the flux cut does not affect the results, but at faint 
levels, where for every detected source there are several true source candidates, this cut 
does affect the positional errors. Figure 4 shows that the positions obtained by detection
with no prior information are indeed worse than those obtained with the a priori knowledge. 
This further strengthens the statement that the confusion effects shown in this paper are in 
fact far less severe than in any real astronomical experiment. 
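
The matching rule described above (closest true source whose flux lies between 0.67 and 1.33 of the detected flux) can be sketched as follows. The function name, the brute-force search, and the use of -1 to flag unmatched detections are assumptions made for this illustration.

```python
import numpy as np


def match_detections(det_x, det_y, det_flux, true_x, true_y, true_flux,
                     lo=0.67, hi=1.33):
    """For each detection, index of the nearest true source whose flux lies in
    [lo, hi] times the detected flux, or -1 if no true source qualifies."""
    det_x, det_y, det_flux = (np.asarray(a, dtype=float)
                              for a in (det_x, det_y, det_flux))
    true_x, true_y, true_flux = (np.asarray(a, dtype=float)
                                 for a in (true_x, true_y, true_flux))
    matches = np.full(len(det_x), -1, dtype=int)
    for i in range(len(det_x)):
        ok = (true_flux >= lo * det_flux[i]) & (true_flux <= hi * det_flux[i])
        if not ok.any():
            continue
        d2 = (true_x - det_x[i]) ** 2 + (true_y - det_y[i]) ** 2
        matches[i] = int(np.argmin(np.where(ok, d2, np.inf)))
    return matches
```

The match is made per detection, so two detections can be assigned the same true source; this mirrors the non-uniqueness noted in the text rather than resolving it.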



5. Discussion of recent data 

The deepest recent ground-based visual and near-infrared observations of blank sky
are not confusion limited (eg, Djorgovski et al 1995; Hogg et al 1997). However, since
they are in the region of s/b = 1/50 sources per beam, it does not make sense to perform
deeper imaging until wide-field, ground-based adaptive optics can be used. Radio imaging
has been confusion limited for some time (eg, Condon 1974) so investigators are usually
careful to truncate analyses before confusion noise becomes destructive. In all these fields of
astronomy, telescope time is better spent increasing field area than depth. Although until
the launch of Chandra essentially all deep x-ray imaging was confusion limited (eg, Barcons
1992), present-day space-based x-ray and optical imaging is not yet at the confusion limit
(Williams et al 1996; Brandt et al in preparation). This may change with planned future
instrumentation and exposures.

Unfortunately, several recent publications on faint mid-infrared and submillimeter 
sources have ignored confusion as a possible source of error and are beyond the 
confusion limit. This is particularly serious since the number count slopes are very steep 
($1.5 < \beta < 2.0$) both observationally and according to simple models (Hughes et al 1998;
Blain et al 1998; Barger et al 1999a; Aussel et al 1999; Elbaz et al 1999). One 850 μm study
shows sources to s/b ~ 1/11 (Hughes et al 1998) and others show sources to s/b ~ 1/50
(Smail et al 1998; Eales et al 1999). The ISO counts at 15 μm have been pushed to
s/b ~ 1/25 (Aussel et al 1999). The fact that a significant fraction of the submillimeter 
sources show no striking visual counterparts is not at all surprising; the submillimeter 
positions will be shifted from their true positions by more than the HWHM or 7.4 arcsec. 
The authors generally consider only a region of radius ~ $\theta_{\rm HWHM}/(S/N)$ where $(S/N)$ is
the estimated signal-to-noise ratio of the detection; these radii are generally 2 to 4 arcsec
(Hughes et al 1998; Smail et al 1998; Barger et al 1999b). The results of the analysis
presented here suggest that these authors should be looking in a region a factor of 4 to 10
larger in solid angle. 

It might be hoped that confusion is not so destructive because perhaps the true 
mid-infrared and submillimeter source number-flux relations are not nearly as steep as what 
is measured and modeled. However, if so, many of the faintest reported sources must truly 
be spurious. One phenomenological model (Barger et al 1999a) shows the number-flux
relation flattening just below the faintest detected sources, but it does not flatten quickly 
enough to solve the confusion problem. Future imaging efforts would be better spent 
increasing field area than exposure times, and counterparts ought to be sought in large 
error boxes. 

A recent comprehensive review (Blain et al 1998) estimates the flux levels at which 
future ground- and space-based infrared through radio surveys will become confusion 
limited, using the incorrect criterion of s/b = 1 source per beam. Their limiting flux levels
become more realistic when multiplied by a factor of ~ 10 to 30 to bring the surface 
densities down to the confusion noise limits presented here. 
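
A factor of this size is what a simple power-law scaling gives, sketched here as an illustration. The function name is hypothetical, and the scaling assumes that $N(>S) \propto S^{-\beta}$ holds between the two flux levels: lowering the source density from one source per beam to a target s/b then raises the limiting flux by $(s/b)^{-1/\beta}$.

```python
def flux_limit_factor(s_per_b, beta):
    """Factor by which a flux limit defined at 1 source per beam must be
    multiplied to reach s_per_b sources per beam, for N(>S) ~ S**-beta."""
    return (1.0 / s_per_b) ** (1.0 / beta)


# Target of 1/30 source per beam for two count slopes.
factor_steep = flux_limit_factor(1.0 / 30.0, 1.5)
factor_flat = flux_limit_factor(1.0 / 30.0, 1.0)
```

For a target of s/b = 1/30 this gives a factor of about 9.7 at $\beta = 1.5$ and 30 at $\beta = 1.0$.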

Another area in which deep imaging in crowded fields is necessary is studies of the 
Galactic center (eg, Eckart & Genzel 1997; Genzel et al 1997; Ghez et al 1998). In 
these studies, accurate astrometry is needed not just for identifying counterparts at other 
wavelengths, but also for measuring proper motions, on the basis of which the central black 
hole mass is estimated. Current analyses of the Galactic center go to more crowded depths 
than s/b = 10 sources per beam in the central 1 arcsec$^2$. It may be important for the
Galactic center investigators to show that the large proper motions they observe are truly
the motions of individual bright stars and not seriously affected by many small motions
in the underlying sea of unresolved sources. If the detected motions have a significant
confusion-induced component, it can be predicted that the stellar accelerations will deviate 
significantly from their gravitational expectations. This prediction may already have been 
falsified, at least at bright levels (Ghez et al 2000). 

6. Conclusions 

For typical faint imaging in the visual and near-infrared, in which number counts
have the form $d\log N/d\log S = -\beta$ with $\beta \approx 0.75$, the confusion limit rule-of-thumb that
imaging should not be pursued much fainter than s/b ≈ 1/30 sources per beam is essentially
correct, both for obtaining good positions and good photometry. Optimistic simulations
show that positions and fluxes of sources more numerous than this condition are likely to
have large uncertainties. When number counts are steep, with Euclidean $\beta = 1.5$ or steeper,
the problem is even more severe and a better rule-of-thumb is something like s/b ~ 1/50.

Source identifications in one set of imaging data based on detections in another set 
will be affected by these confusion-induced astrometry errors. It is essential that surveys 
working near the confusion limit perform realistic simulations (which include the sources 
well faint of any detection limits) in order to draw conservative positional error boxes for 
source identification. 

Gerry Neugebauer and Tom Soifer drummed the "30 beams per source" rule-of-thumb 
into my head. Useful comments, code and information came from Tal Alexander, John 
Bahcall, Roger Blandford, Tom Chester, Judy Cohen, Jim Condon, Daniel Eisenstein, Tom 
Jarrett, Wayne Landsman, Bruce Partridge, Eric Richards, Douglas Scott and Ian Smail.
Financial support was provided under Hubble Fellowship grant HF-01093.01-97A from 
STScI, which is operated by AURA under NASA contract NAS 5-26555. This research 
made use of the NASA ADS Abstract Service. 



REFERENCES 

Aussel H., Cesarsky C. J., Elbaz D., Starck J. L., 1999, A&A, 342, 313
Barcons X., 1992, ApJ, 396, 460
Barger A. J., Cowie L. L., Sanders D. B., Fulton E., Taniguchi Y., Sato Y., Kawara K., Okuda H., 1998, Nature, 394, 248
Barger A. J., Cowie L. L., Sanders D. B., 1999a, ApJ, 518, L5
Barger A. J., Cowie L. L., Smail I., Ivison R. J., Blain A. W., Kneib J.-P., 1999b, AJ, 117, 2656
Blain A. W., Ivison R. J., Smail I., 1998, MNRAS, 296, L29
Condon J. J., 1974, ApJ, 188, 279
Djorgovski S. et al, 1995, ApJ, 438, L13
Eales S., Lilly S., Gear W., Dunne L., Bond J. R., Hammer F., Le Fevre O., Crampton D., 1999, ApJ, 515, 518
Eckart A., Genzel R., 1997, MNRAS, 284, 576
Elbaz D. et al, 1999, A&A, 351, L37
Franceschini A., 1982, Ap&SS, 86, 3
Gardner J. P. et al, 2000, AJ, 119, 486
Genzel R., Eckart A., Ott T., Eisenhauer F., 1997, MNRAS, 291, 219
Ghez A. M., Klein B. L., Morris M., Becklin E. E., 1998, ApJ, 509, 676
Ghez A., Becklin E. E., Kremenek T., Tanner A., 2000, Nature, 407, 349
Goldberg D. M., 1998, ApJ, 498, 156
Goldberg D. M., Wozniak P. R., 1998, Acta Astronomica, 48, 19
Hacking P., Houck J. R., 1987, ApJS, 63, 311
Hogg D. W., Pahre M. A., McCarthy J. K., Cohen J. G., Blandford R., Smail I., Soifer B. T., 1997, MNRAS, 288, 404
Hughes D. H. et al, 1998, Nature, 394, 241
Rajagopal J., Allen R. J., 2000, in Working on the Fringe: An International Conference on Optical and IR Interferometry from Ground and Space, eds. Unwin S., Stachnik R., ASP Conference Series, in press
Richards E. A., Kellerman K. I., Fomalont E. B., Windhorst R. A., Partridge R. B., 1998, AJ, 116, 1039
Rowan-Robinson M. et al, 1997, MNRAS, 289, 490
Scheuer P. A. G., 1957, Proc. Camb. Phil. Soc., 53, 764
Smail I., Ivison R. J., Blain A. W., Kneib J.-P., 1998, ApJ, 507, L21
Stetson P. B., 1987, PASP, 99, 191
Williams R. E. et al, 1996, AJ, 112, 1335
Yu J. W., Shaklan S. B., Shao M., 1993, Proc. SPIE, 1947, 209



This preprint was prepared with the AAS LaTeX macros v4.0.




Fig. 1. The four artificial images, labeled by the number-count exponent $\beta$ (see text for
definition). The images are stretched so that a source at the s/b = 1/30 source per beam
level appears the same in all images.

Fig. 2. Astrometry (position) errors as a function of the number of sources per beam, for
isolated sources (see text), given in terms of the beam HWHM. These positional errors are
really a function of flux, but for clarity the flux is expressed not in Jy but as the source
density s/b to that flux level. The lower line is a running median
and the upper line is a running 90-percent line. The panels are labeled by number count
slope $\beta$ (see text for definition). The median positional error in the $\beta = 1.5$ case is roughly
0.6 $\theta_{\rm HWHM}$ for sources with fluxes such that a catalog to that level would contain s/b = 1/30
of a source per beam.

Fig. 3. Fractional flux errors, for isolated sources (see text), as a function of the number
of sources per beam. The middle line is a running median and the upper and lower lines
are running 10 and 90-percent lines. The panels are labeled by number count slope $\beta$ (see
text for definition). The median in the large-$\beta$ panels appears above the majority of the
points because there are a significant number of detected sources with no true source within
1.5 HWHM.

Fig. 4. Astrometry (position) errors as a function of the number of detected sources per
beam, given in terms of the beam HWHM, as in Figure 2, but now for sources detected with
no a priori knowledge of their positions. The detected sources were matched to true source
positions using flux cuts; see text for details. The lower line is a running median and the
upper line is a running 90-percent line. The panels are labeled by true number count slope
$\beta$ (see text for definition).



